Hybrid networks based on RBFN and GMM for speaker recognition
Abstract
In this paper, a hybrid network combining Radial Basis Function Networks (RBFNs) and Gaussian Mixture Models (GMMs) is proposed and used for speaker recognition. The hybrid network is hierarchical: a GMM is built for each speaker and an RBFN is built for each group of speakers. The GMMs and RBFNs are trained independently. The RBFNs serve as a first-stage coarse classifier and the GMMs serve as the final classifier. For each RBFN, only the first several candidates are chosen to take part in the final classification. The hybrid system is evaluated on speaker recognition with the SPIDRE database. Experiments were carried out to choose the proper structure and parameters of the RBFNs and GMMs. After applying the RBFNs, about 40% of the speakers were excluded without decreasing performance. If the speaker sets most easily confused by the GMMs are grouped into the same RBFNs, the performance of the GMMs can be improved further by using the RBFNs.
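The two-stage decision described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: training is omitted, and the function names, the dictionary layout, the single shared RBF width, the diagonal covariances, and the averaging of scores over frames are all assumptions made for the example.

```python
import numpy as np

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM (assumed form)."""
    # frames: (T, D); weights: (M,); means and variances: (M, D)
    diff = frames[:, None, :] - means[None, :, :]                      # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))  # (T, M)
    log_weighted = log_comp + np.log(weights)
    peak = log_weighted.max(axis=1, keepdims=True)                     # for a stable log-sum-exp
    log_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(log_frame.mean())

def rbfn_speaker_scores(frames, centers, width, out_weights):
    """Mean RBFN output per speaker of one group (Gaussian hidden units, linear outputs)."""
    # centers: (H, D); out_weights: (H, S) with one output per speaker in the group
    sq_dist = np.sum((frames[:, None, :] - centers[None, :, :]) ** 2, axis=2)  # (T, H)
    hidden = np.exp(-sq_dist / (2.0 * width ** 2))
    return (hidden @ out_weights).mean(axis=0)                         # (S,)

def hybrid_identify(frames, groups, gmms, top_k):
    """Stage 1: each group's RBFN keeps its top_k speakers. Stage 2: GMMs pick the winner."""
    candidates = []
    for group in groups:
        scores = rbfn_speaker_scores(frames, group["centers"],
                                     group["width"], group["out_weights"])
        best = np.argsort(scores)[::-1][:top_k]
        candidates.extend(group["speakers"][i] for i in best)
    logliks = {spk: gmm_avg_loglik(frames, *gmms[spk]) for spk in candidates}
    return max(logliks, key=logliks.get), candidates

# Toy data (hypothetical): four speakers in two groups, 2-D features.
speaker_means = {"a": [0.0, 0.0], "b": [5.0, 0.0], "c": [0.0, 5.0], "d": [5.0, 5.0]}
gmms = {s: (np.array([1.0]), np.array([m]), np.ones((1, 2)))
        for s, m in speaker_means.items()}
groups = [
    {"speakers": ["a", "b"], "width": 1.0, "out_weights": np.eye(2),
     "centers": np.array([speaker_means["a"], speaker_means["b"]])},
    {"speakers": ["c", "d"], "width": 1.0, "out_weights": np.eye(2),
     "centers": np.array([speaker_means["c"], speaker_means["d"]])},
]
frames = np.array([[0.1, 5.0], [-0.1, 5.1], [0.0, 4.9]])  # frames near speaker "c"
speaker, candidates = hybrid_identify(frames, groups, gmms, top_k=1)
```

With `top_k=1`, each group forwards a single speaker, so only half of the toy speakers reach the GMM stage; the GMM likelihoods are computed only for that reduced candidate list.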
Similar resources
Automatic Speech Emotion and Speaker Recognition based on Hybrid GMM and FFBNN
In this paper we present text-dependent speaker recognition enhanced by first detecting the emotion of the speaker, using hybrid FFBNN and GMM methods. The emotional state of the speaker influences the recognition system. The Mel-frequency Cepstral Coefficient (MFCC) feature set is used for experimentation. To recognize the emotional state of a speaker, a Gaussian Mixture Model (GMM) is used ...
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy is continuously improving, there is still a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Day-ahead Price Forecasting of Electricity Markets by a New Hybrid Forecast Method
An energy price forecast is key information for generating companies preparing their bids in electricity markets. However, this forecasting problem is complex due to the nonlinear, non-stationary, and time-variant behavior of electricity price time series. Accordingly, in this paper a new strategy is proposed for electricity price forecasting. The forecast strategy includes Wavelet Transform (WT...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...